Speaker verification based on broad phonetic categories

نویسندگان

  • Sachin S. Kajarekar
  • Hynek Hermansky
چکیده

In this work we present a speaker verification system based on 4 broad phonetic categories: vowels+diphthongs, fricatives, glides+nasals, and silence+stops. Using these categories separately, it is observed that vowels, diphthongs, and fricatives are the most important categories for speaker verification. This observation confirms the results from the analysis of speaker and channel variability in speech. Using NIST speaker verification evaluation data, the performance of the phone based system is compared with the conventional speaker verification system based on Gaussian mixture model (GMM). The results show that the phone-based system outperforms the conventional system specifically when there is channel mismatch between training and testing data.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Speaker Recognition and Broad Phonetic Groups

The aim of this study is to provide a quantitative assessment of the speaker discriminating properties of broad phonetic groups. GMM based approach to speaker modelling is used in conjunction with a phonetically handlabelled speech database (TIMIT) to produce broad phonetic group ranking based on speaker identification scores. The broad phonetic groups nasals and vowels were found to be particu...

متن کامل

Mixture of Auto-Associative Neural Networks for Speaker Verification

The paper introduces a mixture of auto-associative neural networks for speaker verification. A new objective function based on posterior probabilities of phoneme classes is used for training the mixture. This objective function allows each component of the mixture to model part of the acoustic space corresponding to a broad phonetic class. This paper also proposes how factor analysis can be app...

متن کامل

Investigation of Frame Alignments for GMM-based Text-prompted Speaker Verification

The frame alignment acts as an important role in GMM-based speaker verification. In text-prompted speaker verification, it is common practice to use the transcriptions to align speech frames to phonetic units. In this paper, we compare the performance of alignments from hidden Markov model (HMM) and deep neural network (DNN), using the same training data and phonetic units. We incorporate a pho...

متن کامل

Speaker verification with limited enrollment data

New methods for speaker veri cation that address the problems of limited training data and unknown telephone channel are presented. We describe a system for studying the feasibility of telephone based voice signatures for electronic documents that uses speaker veri cation with a xed test phrase but very limited data for training speaker models. We examine three methods for speaker veri cation t...

متن کامل

Gaussian mixture modelling of broad phonetic and syllabic events for text-independent speaker verification

This paper examines the usefulness of a multilingual broad syllable-based framework for text-independent speaker verification. Syllabic segmentation is used in order to obtain a convenient unit for constrained and more detailed model generation. Gaussian mixture models are chosen as a suitable modelling paradigm for initial testing of the framework. Promising results are presented for the NIST ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2001